@qnixsynapse commented Mar 1, 2025

Moved the CPY kernels to a separate file for better maintainability and added several missing type-conversion kernels, including dequantizing copy kernels:

  • Q8_0 -> F32
  • Q4_0 -> F32
  • Q4_1 -> F32
  • F32 -> Q5_0
  • Q5_0 -> F32
  • F32 -> Q5_1
  • Q5_1 -> F32
  • F32 -> IQ4_NL
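For readers unfamiliar with what a dequantizing copy does, here is a minimal host-side sketch of the Q8_0 -> F32 and Q4_0 -> F32 conversions. It is not the SYCL kernel code from this PR. The struct and function names (`BlockQ8_0`, `BlockQ4_0`, `dequantize_q8_0`, `dequantize_q4_0`) are hypothetical, and the per-block scale is stored as a plain `float` here for portability, whereas ggml stores it as a 16-bit half.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Both formats group values into blocks of 32 that share one scale.
constexpr int QK = 32;

// Simplified Q8_0 block (assumption: float scale instead of ggml's half).
struct BlockQ8_0 {
    float  d;       // per-block scale
    int8_t qs[QK];  // 32 signed 8-bit quants
};

// Simplified Q4_0 block (assumption: float scale instead of ggml's half).
struct BlockQ4_0 {
    float   d;           // per-block scale
    uint8_t qs[QK / 2];  // 32 4-bit quants, packed two per byte
};

// Q8_0 -> F32: every output value is scale * quant.
std::vector<float> dequantize_q8_0(const std::vector<BlockQ8_0>& blocks) {
    std::vector<float> out;
    out.reserve(blocks.size() * QK);
    for (const BlockQ8_0& b : blocks) {
        for (int i = 0; i < QK; ++i) {
            out.push_back(b.d * static_cast<float>(b.qs[i]));
        }
    }
    return out;
}

// Q4_0 -> F32: low nibbles hold the first 16 values of a block, high
// nibbles the last 16; each nibble is stored with an offset of 8.
std::vector<float> dequantize_q4_0(const std::vector<BlockQ4_0>& blocks) {
    std::vector<float> out(blocks.size() * QK);
    for (std::size_t ib = 0; ib < blocks.size(); ++ib) {
        const BlockQ4_0& b = blocks[ib];
        float* y = out.data() + ib * QK;
        for (int j = 0; j < QK / 2; ++j) {
            const int lo = (b.qs[j] & 0x0F) - 8;
            const int hi = (b.qs[j] >> 4) - 8;
            y[j]          = b.d * static_cast<float>(lo);
            y[j + QK / 2] = b.d * static_cast<float>(hi);
        }
    }
    return out;
}
```

A device kernel would typically assign blocks or elements to work-items rather than loop serially, but the per-element arithmetic is the same.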

Updated the backend device support check in ggml-sycl.cpp to enable these new type combinations.
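As a rough illustration of the idea only, a support check for the newly added combinations could look like the sketch below. The enum `Type` and the function `is_newly_supported_cpy` are hypothetical stand-ins, not the actual code in ggml-sycl.cpp; the (source, destination) pairs are the ones listed above.

```cpp
#include <utility>

// Hypothetical stand-in for the tensor type enum used by the backend.
enum class Type { F32, Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, IQ4_NL };

// Returns true if (src -> dst) is one of the copy combinations added in
// this PR. Combinations that were already supported are not covered here.
bool is_newly_supported_cpy(Type src, Type dst) {
    static const std::pair<Type, Type> pairs[] = {
        {Type::Q8_0, Type::F32},
        {Type::Q4_0, Type::F32},
        {Type::Q4_1, Type::F32},
        {Type::F32,  Type::Q5_0},
        {Type::Q5_0, Type::F32},
        {Type::F32,  Type::Q5_1},
        {Type::Q5_1, Type::F32},
        {Type::F32,  Type::IQ4_NL},
    };
    for (const auto& p : pairs) {
        if (p.first == src && p.second == dst) {
            return true;
        }
    }
    return false;
}
```

Note that the list is directional: F32 -> IQ4_NL is included, while IQ4_NL -> F32 is not.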

Locally, test-backend-ops passes with this change.

@github-actions bot added the ggml and SYCL labels Mar 1, 2025

@NeoZhangJianyu left a comment

Good job!


@Rbiessy left a comment

Thanks, this looks nicer.

@Rbiessy merged commit ece9745 into ggml-org:master Mar 3, 2025
47 checks passed
@qnixsynapse deleted the cpy branch March 3, 2025 13:57
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
…ggml-org#12133)

* SYCL: refactor and move cpy kernels to a separate file

* Add few missing cpy kernels

* refactor and add debug logs
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025
…ggml-org#12133)

* SYCL: refactor and move cpy kernels to a separate file

* Add few missing cpy kernels

* refactor and add debug logs